Lab 4 Objectives

  1. Load and plot raster data.
  2. Understand and reproject coordinate reference systems.
  3. Use a csv file with location data to create a spatial map.
  4. Use points (vector data) to extract values from pixels in a raster object.

1. Introduction

In this lab, we’ll work with spatial raster and vector datasets from the Harvard Forest NEON site to assess canopy height from airborne Lidar remotely sensed imagery. We will work through core concepts and considerations for spatial data in R.

First, we’ll load the geospatial and tidyverse libraries that we need for this lab.

# Load libraries
library(raster)
library(rgdal)
library(ggplot2)
library(dplyr)
library(sf)

2. Plotting a raster

We start by working with a series of GeoTIFF files in this lab, which contains a set of embedded tags with metadata about the raster data. We’ll start with a raster of data for the ground surface height (called a digital terrain model or DTM) at Harvard Forest in central MA. We can use the function GDALinfo() to get information about our raster data before we read that data into R. It is ideal to do this before importing your data.

# Look at metadata before loading
GDALinfo("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif")
## rows        1367 
## columns     1697 
## bands       1 
## lower left origin.x        731453 
## lower left origin.y        4712471 
## res.x       1 
## res.y       1 
## ysign       -1 
## oblique.x   0 
## oblique.y   0 
## driver      GTiff 
## projection  +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs 
## file        data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif 
## apparent band summary:
##    GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float64           TRUE       -9999          1       1697
## apparent band statistics:
##     Bmin   Bmax    Bmean      Bsd
## 1 304.56 389.82 344.8979 15.86147
## Metadata:
## AREA_OR_POINT=Area

Now that we’ve previewed the metadata for our GeoTIFF, let’s import this raster dataset into R and explore its metadata more closely using the raster() function.

# Import DTM data & convert to data frame
DTM_HARV <- raster("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif")

# Look at raster object: shows resolution (size of a pixel), extent
# coordinate reference system (CRS), names (the variable in the cells)
DTM_HARV
## class      : RasterLayer 
## dimensions : 1367, 1697, 2319799  (nrow, ncol, ncell)
## resolution : 1, 1  (x, y)
## extent     : 731453, 733150, 4712471, 4713838  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : /Users/jmatthes/Google Drive/courses/BISC307/FA20/labs/EcosysEco-FA20-Lab4/data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif 
## names      : HARV_dtmCrop 
## values     : 304.56, 389.82  (min, max)
# Look at summary stats for ground elevation
summary(DTM_HARV)
## Warning in .local(object, ...): summary is an estimate based on a sample of 1e+05 cells (4.31% of all cells)
##         HARV_dtmCrop
## Min.          304.67
## 1st Qu.       334.54
## Median        344.40
## 3rd Qu.       357.43
## Max.          389.72
## NA's            0.00

To visualise this data in R using ggplot2, we need to convert it to a data frame using the as.data.frame() function from the raster package. Then we can use ggplot() to plot these data. We will set the color scale to scale_fill_viridis_c which is a color-blindness-friendly color scale. We will also use the coord_quickmap() function to use an approximate Mercator projection for our plots. This approximation is suitable for small areas that are not too close to the poles. Other coordinate systems are available in ggplot2 if needed, you can learn about them at their help page ?coord_map.

# Convert DTM from a raster to a data frame
DTM_HARV_df <- as.data.frame(DTM_HARV, xy = TRUE)

# Look at snapshot of data frame version
head(DTM_HARV_df)
##          x       y HARV_dtmCrop
## 1 731453.5 4713838       389.40
## 2 731454.5 4713838       389.55
## 3 731455.5 4713838       389.48
## 4 731456.5 4713838       389.43
## 5 731457.5 4713838       389.33
## 6 731458.5 4713838       389.23
# Plot HARV digital terrain model (ground elevation) raster 
# Here we fill with HARV_dtmCrop, because this is what we see for 'names' variable
ggplot(data = DTM_HARV_df ) +
    geom_raster(aes(x = x, y = y, fill = HARV_dtmCrop)) +
    scale_fill_viridis_c() +
    coord_quickmap()

This map shows the surface level elevation of our study site in Harvard Forest. From the legend, we can see that the maximum elevation is ~400, but we can’t tell whether this is 400 feet or 400 meters because the legend doesn’t show us the units. We can look at the metadata of our object to see what the units are. Much of the metadata that we’re interested in is part of the coordinate reference system (CRS).

3. Interpreting the CRS in Proj4 format

We can view the CRS string associated with our R object using the crs() function.

# Check CRS for the HARV ground elevation DTM 
crs(DTM_HARV)
## CRS arguments:
##  +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84
## +towgs84=0,0,0

The CRS for our data is given to us by R in proj4 format. Let’s break down the pieces of proj4 string. The string contains all of the individual CRS elements that R or another GIS might need. Each element is specified with a + sign, similar to how a .csv file is delimited or broken up by a ,. After each + we see the CRS element being defined. For example projection (proj=) and datum (datum=).

Our projection string for DSM_HARV specifies the UTM projection as follows:

+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0

3. Missing Data & Bad Data Values

Raster data also often has a NoDataValue associated with it. This is a value assigned to pixels where data is missing or no data were collected. The value that is conventionally used to take note of missing data (the NoDataValue value) varies by the raster data type. For integers, -9999 is common. In some cases, other NA values may be more appropriate. An NA value should be a) outside the range of valid values, and b) a value that fits the data type in use.

If we are lucky, our GeoTIFF file has a tag that tells us what is the NoDataValue. If we are less lucky, we can find that information in the raster’s metadata. If a NoDataValue was stored in the GeoTIFF tag, when R opens up the raster, it will assign each instance of the value to NA. Values of NA will be ignored by R as demonstrated above.

We can use the output from the GDALinfo() function to find out what NoDataValue is used for our DTM_HARV dataset:

# Check NoDataValue for DTM raster
GDALinfo("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif")
## rows        1367 
## columns     1697 
## bands       1 
## lower left origin.x        731453 
## lower left origin.y        4712471 
## res.x       1 
## res.y       1 
## ysign       -1 
## oblique.x   0 
## oblique.y   0 
## driver      GTiff 
## projection  +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs 
## file        data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_dtmCrop.tif 
## apparent band summary:
##    GDType hasNoDataValue NoDataValue blockSize1 blockSize2
## 1 Float64           TRUE       -9999          1       1697
## apparent band statistics:
##     Bmin   Bmax    Bmean      Bsd
## 1 304.56 389.82 344.8979 15.86147
## Metadata:
## AREA_OR_POINT=Area

Bad data values are different from NoDataValues. Bad data values are values that fall outside of the applicable range of a dataset. Sometimes, we need to use some common sense and scientific insight as we examine the data - just as we would for field data to identify questionable values.

Plotting data with appropriate highlighting can help reveal patterns in bad values and may suggest a solution. Here, we can create a histogram to see the distribution of values in our data, which can help to identify bad values:

# Histogram of HARV DTM raster values
ggplot(data = DTM_HARV_df) +
    geom_histogram(aes(HARV_dtmCrop))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).


Code Challenge 1: Use GDALinfo() to determine the following about the NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_DSMhill.tif file:


4. Layering rasters & reprojecting

We can layer a raster on top of a hillshade raster for the same area, and use a transparency factor to create a fancy 3-dimensional effect. A hillshade is a raster that maps the shadows and texture that you would see from above when viewing terrain.

But before we do that, we’ll need to load the hillshade data and check to make sure that the CRS projection matches that of the DTM, otherwise they won’t be able to be plotted together.

First we need to read in our DTM hillshade data and view the structure:

# Load hillshade raster
DTM_hill_HARV <-
  raster("data/NEON-DS-Airborne-Remote-Sensing/HARV/DTM/HARV_DTMhill_WGS84.tif")

# Check CRS of the hillshade raster
crs(DTM_hill_HARV)
## CRS arguments:
##  +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
# Does it match the CRS of the DTM data
crs(DTM_HARV)
## CRS arguments:
##  +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84
## +towgs84=0,0,0

Nooooo! The projection for the DTM is in UTM zone 18N, but the hillshade raster is in longlat (longitude/latitude WGS84). We’ll need to convert the projection of the hillshade raster to plot these together.

We can use the projectRaster() function to reproject a raster into a new CRS. Keep in mind that reprojection only works when you first have a defined CRS for the raster object that you want to reproject. It cannot be used if no CRS is defined. Lucky for us, the DTM_hill_HARV has a defined CRS, even if the wrong one.

To use the projectRaster() function, we need to define two things: 1) the object we want to reproject, and 2) the CRS that we want to reproject it to. The syntax is: projectRaster(RasterObject, crs = CRSToReprojectTo)

Within the projectRaster() function we can assign the CRS of our DTM_HARV as follows: crs = crs(DTM_HARV). Note that we are using the projectRaster() function on the raster object, not the data.frame() we use for plotting with ggplot.

First we will reproject our DTM_hill_HARV raster data to match the DTM_HARV raster CRS:

# Reproject hillshade raster to DTM_HARV crs
DTM_hill_UTMZ18N_HARV <- projectRaster(DTM_hill_HARV,
                                       crs = crs(DTM_HARV))
# Check new crs
crs(DTM_hill_UTMZ18N_HARV)
## CRS arguments:
##  +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84
## +towgs84=0,0,0

We can also check the resolution of our reprojected raster, and we can recalculate the resolution for 1m grid cells (it’s close, but not quite 1m):

# Check original resolution
res(DTM_hill_UTMZ18N_HARV)
## [1] 1.000 0.998
# Reproject original raster and constrain to 1m resolution
DTM_hill_UTMZ18N_HARV <- projectRaster(DTM_hill_HARV,
                                  crs = crs(DTM_HARV),
                                  res = 1)

Now we can layer another raster on top of our hillshade by adding another call to the geom_raster() function. Let’s overlay DTM_HARV on top of the hill_HARV.

# Convert to data frame
DTM_hill_HARV_df <- as.data.frame(DTM_hill_UTMZ18N_HARV, xy = TRUE) 

# Plot just the hillshade data
ggplot(data = DTM_hill_HARV_df) +
  geom_raster(aes(x = x, y = y, alpha = HARV_DTMhill_WGS84)) + 
  scale_alpha(range =  c(0.15, 0.65), guide = "none") + 
  coord_quickmap()

And finally, we can plot the ground elevation and hillshade together on the same plot:

# Plot DTM ground elevation and hillshade together
ggplot(data = DTM_HARV_df) +
  geom_raster(aes(x = x, y = y, 
                  fill = HARV_dtmCrop)) + 
  geom_raster(data = DTM_hill_HARV_df, 
              aes(x = x, y = y, 
                  alpha = HARV_DTMhill_WGS84)) +  
  scale_fill_viridis_c() +  
  scale_alpha(range = c(0.15, 0.65), guide = "none") +  
  ggtitle("Elevation with hillshade") +
  coord_quickmap()

5. Raster Calculations

The DTM includes data for the ground surface elevation. Within the data/ folder there is also digital surface model (DSM) data, which is collected by an airborne sensor that measures the very top of each object on the land surface (whether a treetop, pavement, roof, etc.). Both of these datasets were collected by the NEON airborne lidar sensor that you watched a video about for your PreLab.

We will calculate the difference between these two lidar-derived rasters (DSM minus DTM) as a canopy height model (CHM) that represents the height of vegetation at Harvard Forest.

We can calculate the difference between two rasters in two different ways:

  1. directly subtracting the two rasters in R using raster math
  2. using the overlay() function for more efficient processing - particularly if our rasters are large and/or the calculations are complex
# Import the DSM data and check the projection
DSM_HARV <- raster("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
crs(DSM_HARV)
## CRS arguments:
##  +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84
## +towgs84=0,0,0
# Raster math: subtract DTM (ground level) from DSM (elevation of objects)
CHM_HARV <- DSM_HARV - DTM_HARV

# Change to data frame for plotting
CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)

# Plot the new output canopy height model
ggplot(data = CHM_HARV_df) +
   geom_raster(aes(x = x, y = y, fill = layer)) + 
   scale_fill_gradientn(name = "Canopy Height (m)", colors = terrain.colors(10)) + 
   coord_quickmap()

We can look at a distribution of values in the canopy height model to get a sense of the range of canopy heights at this site:

# Plot histogram of Harvard Forest canopy height
ggplot(CHM_HARV_df) +
    geom_histogram(aes(layer))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 1 rows containing non-finite values (stat_bin).

And we could have equivalently done this raster math with the overlay() function, which takes two or more rasters and applies a function to them. The syntax is: outputRaster <- overlay(raster1, raster2, fun=functionName)

# Use overlay for raster subtraction
CHM_ov_HARV <- overlay(DSM_HARV,
                       DTM_HARV,
                       fun = function(r1, r2) { return( r1 - r2) })

# Convert to data frame
CHM_ov_HARV_df <- as.data.frame(CHM_ov_HARV, xy = TRUE)

# Plot the canopy height model for Harvard Forest
ggplot(data = CHM_ov_HARV_df) +
  geom_raster(aes(x = x, y = y, fill = layer)) + 
  scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) + 
  coord_quickmap()


Code Challenge 2: Within your group, discuss the results from the Harvard Forest canopy height map. Where are there areas of tall tree canopies? Which part of the map might have the highest rates of photosynthesis? What are the units of the x and y axes?


Now that we’ve created a new raster, we could also export the data as a GeoTIFF file using the writeRaster() function. This would save us from having to do these processing steps each time, which is helpful for rasters that are large. When we write this raster object to a GeoTIFF file we’ll name it CHM_HARV.tiff. The writeRaster() function by default writes the output file to your working directory unless you specify a full file path.

We will specify the output format ("GTiff"), the missing data value (NAflag = -9999). We will also tell R to overwrite any data that is already in a file of the same name.

# Export Harvard Forest Canopy height model raster
writeRaster(CHM_ov_HARV, "CHM_HARV.tiff",
            format="GTiff",
            overwrite=TRUE,
            NAflag=-9999)

6. Importing and plotting spatial point data from a csv file

Now we will use the sf package to work with vector data in R. Today we will work with point vector data, which is common if you collect field data at points and then want to connect those data to remotely sensed imagery.

Here we will import spatial points stored in .csv (Comma Separated Value) format into R as an sf spatial object. The HARV_PlotLocations.csv file contains x, y point locations for study plots where NEON collects data on vegetation and other ecological metrics. If a text data file has associated x and y location columns, then we can convert it into an sf spatial object. The sf object allows us to store both the x,y values that represent the coordinate location of each point and the associated attribute data (columns) describing each feature in the spatial object.

Let’s import a .csv file that contains plot coordinate x, y locations at the NEON Harvard Forest Field Site (HARV_PlotLocations.csv).

# Import HARV NEON plot locations
plot_locations_HARV <- read.csv("data/HARV_PlotLocations.csv")
plot_locations_HARV
##     easting northing geodeticDa utmZone   plotID stateProvi    county
## 1  731405.3  4713456      WGS84     18N HARV_015         MA Worcester
## 2  731934.3  4713415      WGS84     18N HARV_033         MA Worcester
## 3  731754.3  4713115      WGS84     18N HARV_034         MA Worcester
## 4  731724.3  4713595      WGS84     18N HARV_035         MA Worcester
## 5  732125.3  4713846      WGS84     18N HARV_036         MA Worcester
## 6  731634.3  4713295      WGS84     18N HARV_037         MA Worcester
## 7  731754.3  4713295      WGS84     18N HARV_038         MA Worcester
## 8  731784.3  4712845      WGS84     18N HARV_039         MA Worcester
## 9  731904.3  4713595      WGS84     18N HARV_040         MA Worcester
## 10 731514.3  4713235      WGS84     18N HARV_041         MA Worcester
## 11 731874.3  4712965      WGS84     18N HARV_042         MA Worcester
## 12 731844.3  4713715      WGS84     18N HARV_043         MA Worcester
## 13 731874.3  4713085      WGS84     18N HARV_044         MA Worcester
## 14 732275.3  4713846      WGS84     18N HARV_045         MA Worcester
## 15 731514.3  4713355      WGS84     18N HARV_046         MA Worcester
## 16 731934.3  4713235      WGS84     18N HARV_047         MA Worcester
## 17 731724.3  4713745      WGS84     18N HARV_048         MA Worcester
## 18 731604.3  4713055      WGS84     18N HARV_049         MA Worcester
## 19 731634.3  4713175      WGS84     18N HARV_050         MA Worcester
## 20 731664.3  4712935      WGS84     18N HARV_051         MA Worcester
## 21 731844.3  4713835      WGS84     18N HARV_052         MA Worcester
##    domainName domainID siteID    plotType  subtype plotSize elevation
## 1   Northeast      D01   HARV distributed basePlot     1600    331.64
## 2   Northeast      D01   HARV       tower basePlot     1600    341.62
## 3   Northeast      D01   HARV       tower basePlot     1600    347.61
## 4   Northeast      D01   HARV       tower basePlot     1600    334.34
## 5   Northeast      D01   HARV       tower basePlot     1600    352.93
## 6   Northeast      D01   HARV       tower basePlot     1600    342.43
## 7   Northeast      D01   HARV       tower basePlot     1600    343.11
## 8   Northeast      D01   HARV       tower basePlot     1600    353.55
## 9   Northeast      D01   HARV       tower basePlot     1600    338.11
## 10  Northeast      D01   HARV       tower basePlot     1600    343.74
## 11  Northeast      D01   HARV       tower basePlot     1600    351.20
## 12  Northeast      D01   HARV       tower basePlot     1600    339.68
## 13  Northeast      D01   HARV       tower basePlot     1600    348.73
## 14  Northeast      D01   HARV       tower basePlot     1600    358.28
## 15  Northeast      D01   HARV       tower basePlot     1600    340.08
## 16  Northeast      D01   HARV       tower basePlot     1600    345.81
## 17  Northeast      D01   HARV       tower basePlot     1600    336.70
## 18  Northeast      D01   HARV       tower basePlot     1600    348.85
## 19  Northeast      D01   HARV       tower basePlot     1600    345.74
## 20  Northeast      D01   HARV       tower basePlot     1600    351.76
## 21  Northeast      D01   HARV       tower basePlot     1600    343.01
##     soilTypeOr plotdim_m
## 1  Inceptisols        40
## 2  Inceptisols        40
## 3  Inceptisols        40
## 4    Histosols        40
## 5  Inceptisols        40
## 6    Histosols        40
## 7    Histosols        40
## 8  Inceptisols        40
## 9  Inceptisols        40
## 10   Histosols        40
## 11 Inceptisols        40
## 12 Inceptisols        40
## 13 Inceptisols        40
## 14 Inceptisols        40
## 15   Histosols        40
## 16 Inceptisols        40
## 17 Inceptisols        40
## 18 Inceptisols        40
## 19   Histosols        40
## 20 Inceptisols        40
## 21   Histosols        40

To convert the data to an sf object, we’ll need to specify x,y columns and the coordinate reference system (crs) for these data.

Looking at the plot_locations_HARV data frame, we can see that easting is the x-bearing, northing is the y-bearing, geodeticDa is the coordinate reference WGS84, and utmZone defines UTM 18N as the projection. Now we can use the proj4string format to define the crs for our data frame:

# Define UTM 18N WGS84 proj4string
utm18nCRS <- "+proj=utm +zone=18 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"

Next, let’s convert our dataframe into an sf object. To do this, we need to specify:

# Convert data frame to sf object
plot_locations_sp_HARV <- st_as_sf(plot_locations_HARV, coords = c("easting", "northing"), crs = utm18nCRS)

plot_locations_sp_HARV
## Simple feature collection with 21 features and 14 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 731405.3 ymin: 4712845 xmax: 732275.3 ymax: 4713846
## CRS:            +proj=utm +zone=18 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
## First 10 features:
##    geodeticDa utmZone   plotID stateProvi    county domainName domainID
## 1       WGS84     18N HARV_015         MA Worcester  Northeast      D01
## 2       WGS84     18N HARV_033         MA Worcester  Northeast      D01
## 3       WGS84     18N HARV_034         MA Worcester  Northeast      D01
## 4       WGS84     18N HARV_035         MA Worcester  Northeast      D01
## 5       WGS84     18N HARV_036         MA Worcester  Northeast      D01
## 6       WGS84     18N HARV_037         MA Worcester  Northeast      D01
## 7       WGS84     18N HARV_038         MA Worcester  Northeast      D01
## 8       WGS84     18N HARV_039         MA Worcester  Northeast      D01
## 9       WGS84     18N HARV_040         MA Worcester  Northeast      D01
## 10      WGS84     18N HARV_041         MA Worcester  Northeast      D01
##    siteID    plotType  subtype plotSize elevation  soilTypeOr plotdim_m
## 1    HARV distributed basePlot     1600    331.64 Inceptisols        40
## 2    HARV       tower basePlot     1600    341.62 Inceptisols        40
## 3    HARV       tower basePlot     1600    347.61 Inceptisols        40
## 4    HARV       tower basePlot     1600    334.34   Histosols        40
## 5    HARV       tower basePlot     1600    352.93 Inceptisols        40
## 6    HARV       tower basePlot     1600    342.43   Histosols        40
## 7    HARV       tower basePlot     1600    343.11   Histosols        40
## 8    HARV       tower basePlot     1600    353.55 Inceptisols        40
## 9    HARV       tower basePlot     1600    338.11 Inceptisols        40
## 10   HARV       tower basePlot     1600    343.74   Histosols        40
##                    geometry
## 1  POINT (731405.3 4713456)
## 2  POINT (731934.3 4713415)
## 3  POINT (731754.3 4713115)
## 4  POINT (731724.3 4713595)
## 5  POINT (732125.3 4713846)
## 6  POINT (731634.3 4713295)
## 7  POINT (731754.3 4713295)
## 8  POINT (731784.3 4712845)
## 9  POINT (731904.3 4713595)
## 10 POINT (731514.3 4713235)
# Plot the plot locations
ggplot(data = plot_locations_sp_HARV) +
  geom_sf() +
  ggtitle("Map of Plot Locations")

# Plot the plot locations on canopy height model
ggplot(data = CHM_ov_HARV_df) +
  geom_raster(aes(x = x, y = y, fill = layer)) + 
  geom_sf(data = plot_locations_sp_HARV) +
  scale_fill_gradientn(name = "Canopy Height", colors = terrain.colors(10)) + 
  coord_sf()

8. Extract raster values at points

Often we want to extract pixel values from a raster layer for particular locations - for example, plot locations that we are sampling on the ground.

We can extract pixel values by defining a buffer or area surrounding individual NEON plot point locations using the extract() function. To do this we define the summary argument (fun = mean) and the buffer distance (buffer = 20) which represents the radius of a circular region around each point. By default, the units of the buffer are the same units as the data’s CRS. All pixels that are touched by the buffer region are included in the extract.

Let’s put this into practice by figuring out the mean tree height in the 20m around each of the NEON forest plots (plot_locations_sp_HARV). We will use the df = TRUE argument to return a data frame.

# Extract tree heights at HARV NEON forest plots with 20m buffer
mean_treeHt_plots <- extract(x = CHM_HARV,
                                  y = plot_locations_sp_HARV,
                                  buffer = 20,
                                  fun = mean, 
                                  df = TRUE)
# Look at NEON plot tree heights
mean_treeHt_plots
##    ID     layer
## 1   1        NA
## 2   2 21.062117
## 3   3 11.758839
## 4   4 15.011521
## 5   5 19.519225
## 6   6 17.095406
## 7   7 18.176274
## 8   8 16.991266
## 9   9  9.047444
## 10 10 17.515907
## 11 11 18.600987
## 12 12 17.585708
## 13 13 13.872596
## 14 14 18.673074
## 15 15  9.675175
## 16 16 13.073901
## 17 17 17.053575
## 18 18 15.471521
## 19 19 17.400262
## 20 20 18.015717
## 21 21 19.493514
# Rejoin tree height data to the NEON plots data frame
plot_locations_sp_HARV <- mutate(plot_locations_sp_HARV, 
                                 treeHt_mean = mean_treeHt_plots$layer)

# Look at edited NEON plots data frame with tree height column
plot_locations_sp_HARV
## Simple feature collection with 21 features and 15 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 731405.3 ymin: 4712845 xmax: 732275.3 ymax: 4713846
## CRS:            +proj=utm +zone=18 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
## First 10 features:
##    geodeticDa utmZone   plotID stateProvi    county domainName domainID
## 1       WGS84     18N HARV_015         MA Worcester  Northeast      D01
## 2       WGS84     18N HARV_033         MA Worcester  Northeast      D01
## 3       WGS84     18N HARV_034         MA Worcester  Northeast      D01
## 4       WGS84     18N HARV_035         MA Worcester  Northeast      D01
## 5       WGS84     18N HARV_036         MA Worcester  Northeast      D01
## 6       WGS84     18N HARV_037         MA Worcester  Northeast      D01
## 7       WGS84     18N HARV_038         MA Worcester  Northeast      D01
## 8       WGS84     18N HARV_039         MA Worcester  Northeast      D01
## 9       WGS84     18N HARV_040         MA Worcester  Northeast      D01
## 10      WGS84     18N HARV_041         MA Worcester  Northeast      D01
##    siteID    plotType  subtype plotSize elevation  soilTypeOr plotdim_m
## 1    HARV distributed basePlot     1600    331.64 Inceptisols        40
## 2    HARV       tower basePlot     1600    341.62 Inceptisols        40
## 3    HARV       tower basePlot     1600    347.61 Inceptisols        40
## 4    HARV       tower basePlot     1600    334.34   Histosols        40
## 5    HARV       tower basePlot     1600    352.93 Inceptisols        40
## 6    HARV       tower basePlot     1600    342.43   Histosols        40
## 7    HARV       tower basePlot     1600    343.11   Histosols        40
## 8    HARV       tower basePlot     1600    353.55 Inceptisols        40
## 9    HARV       tower basePlot     1600    338.11 Inceptisols        40
## 10   HARV       tower basePlot     1600    343.74   Histosols        40
##                    geometry treeHt_mean
## 1  POINT (731405.3 4713456)          NA
## 2  POINT (731934.3 4713415)   21.062117
## 3  POINT (731754.3 4713115)   11.758839
## 4  POINT (731724.3 4713595)   15.011521
## 5  POINT (732125.3 4713846)   19.519225
## 6  POINT (731634.3 4713295)   17.095406
## 7  POINT (731754.3 4713295)   18.176274
## 8  POINT (731784.3 4712845)   16.991266
## 9  POINT (731904.3 4713595)    9.047444
## 10 POINT (731514.3 4713235)   17.515907

Once we extract the raster values at points and reattach it to the data frame, we can use any of the tidying, grouping, and summarizing functions from the tidyverse to further summarize these canopy height data.


Code Challenge 3: Within your group, discuss the analysis steps that you would use to summarize the canopy height data the plot locations to get the mean and standard deviation among plots.



LAB REPORT INSTRUCTIONS: